Introduction
Keyhole surgery (e.g., laparoscopic and thoracoscopic procedures) has been widely adopted due to its advantages, such as low postoperative complication rates, less pain, short recovery time, and excellent cosmesis [
1,
2]. Manipulation challenges in these procedures, including limited visual perception, reduced distal dexterity, reversed hand–eye coordination, and hindered haptic sensing, have inspired a wave of developments in surgical robotic systems that aim to improve surgical assistance and achieve superhuman capabilities [
3–
5].
Robotic systems for keyhole and endoscopic procedures usually consist of a patient-side cart and a surgeon console (Fig. 1). As presented in the “Configuration and actuation of the patient-side cart” section, the patient-side cart often involves a few surgical manipulators that maneuver a laparoscope (or a thoracoscope) and two to three surgical instruments. A laparoscope and thoracoscope are the same in their composition. During a procedure, a laparoscope is inserted into the abdomen, whereas a thoracoscope is inserted into the thorax. Surgical manipulators can have various forms and actuation schemes. Given that the control scheme of surgical robotic systems is mainly teleoperation, the surgeon console primarily consists of a 2D/3D display, a pair of master devices, and a touchscreen/keyboard with a few pedals for user inputs. The system sometimes includes a device cart for additional display, electrical surgical equipment, and data processing computers serving as an information hub during surgery.
Surgeries other than keyhole and endoscopic procedures utilize different treatment techniques, which in turn lead to surgical robotic systems with different characteristics. For example, orthopedic and neurosurgical robots emphasize accurate registration and constrained intraoperative path planning [
6,
7], whereas robotic percutaneous interventional procedures focus on instrument compactness (e.g., via the use of concentric tubes [
8,
9]) and magnetic resonance imaging compatibility [
10,
11].
This paper presents a comprehensive review of the state-of-the-art surgical robotic systems for laparoscopic, thoracoscopic, and endoscopic procedures. The search methodology is reported in the “Search methodology” section. The system configurations, actuation schemes, and design considerations are analyzed in the “Configuration and actuation of the patient-side cart” section, and additional sensors in robotic surgery for haptic sensation and visual perception are discussed in the “Additional sensors in robotic surgery” section. The control approaches, including teleoperation, surgical automation, and system autonomy, are elaborated in the “Control approaches” section. Future developments and perspectives are discussed in the “Challenges and future perspectives” section to inspire future studies. The conclusion is provided in the “Conclusions” section.
Existing survey papers either span too broadly or only focus on a specific topic. For example, survey papers [
3,
5,
12] cover a wide spectrum of advancements in surgical robotics, including orthopedics, neurosurgery, laparoscopy, catheterization, and percutaneous procedures. These papers fail to systematically introduce the state-of-the-art achievements of surgical robotic systems for keyhole and endoscopic procedures. On the other hand, review papers (e.g., References [
6,
7,
11,
13,
14]) focus either on particular applications (e.g., digestive procedures, arthroplasty, neurosurgery, and urology) or on design approaches (e.g., the use of continuum mechanisms). Surgical robotic systems for keyhole and endoscopic procedures share some common characteristics, such as dual-arm manipulation, visual guidance, and teleoperation. This review paper attempts to help readers form a systematic and comprehensive understanding of the state-of-the-art achievements so that future studies can build on them.
Search methodology
This literature review focuses on the enabling technologies and system integration of surgical robots for keyhole and endoscopic procedures. Searches were performed on the Thomson Reuters Web of Science Core Collection and IEEE Xplore to find relevant literature in English. The searches in the Web of Science Core Collection were restricted to the areas of robotics and engineering using the advanced search option, whereas those in IEEE Xplore were conducted using the free-text protocol. The search terms are listed in Table 1 for each subsection, as these subsections are relatively standalone. Papers with at least five average annual citations within the first 100 most-cited papers were identified. Additional relevant records were included from the authors’ literature library.
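The citation-based screening step can be expressed as a short filter. The following is only a sketch under stated assumptions: the record fields, the function name `select_records`, and the definition of average annual citations as total citations divided by the number of years since publication are illustrative choices, not details reported by the authors.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical bibliographic record used only for this illustration."""
    title: str
    year: int
    citations: int


def select_records(records, current_year, top_n=100, min_annual=5.0):
    """Keep records that rank within the top_n most cited and whose
    average annual citation count is at least min_annual."""
    # Rank by total citations and keep the first top_n entries.
    ranked = sorted(records, key=lambda r: r.citations, reverse=True)[:top_n]
    selected = []
    for record in ranked:
        # Assumption: a paper published in the current year counts as one year.
        years = max(current_year - record.year, 1)
        if record.citations / years >= min_annual:
            selected.append(record)
    return selected


if __name__ == "__main__":
    demo = [Record("a", 2010, 100), Record("b", 2018, 8), Record("c", 2019, 30)]
    print([r.title for r in select_records(demo, current_year=2020)])
```

In this toy data set, record "b" is dropped because it averages four citations per year.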
The authors reviewed the abstracts of the identified papers to exclude records on irrelevant topics (e.g., orthopedics, vascular intervention, or neurology), technologies for manual tool designs, surgical platform descriptions with little or no implementation detail, clinical reports with little or no engineering detail, and results with limited practical significance. Because this review does not intend to exhaustively include all relevant papers, sometimes only representative articles and milestone works (e.g., the one or two most-cited papers) were included. All included papers are presented according to the structure of this paper. This selection process was based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) criteria and followed the PRISMA flowchart in Fig. 2 for each subsection. The applied search terms and statistics for each subsection are summarized in Table 1. A total of 219 papers were included in the “Configuration and actuation of the patient-side cart,” “Additional sensors in robotic surgery,” and “Control approaches” sections in this review.
Configuration and actuation of the patient-side cart
A patient-side cart often consists of a few surgical manipulators. As shown in Fig. 3A, the surgical manipulators maneuver a laparoscope (or a thoracoscope) and two to three straight surgical instruments with distal wrists through a few incisions made on the abdominal or thoracic wall to visualize the surgical site and perform treatments. The distal wrists integrated in the surgical instruments are mainly used to increase the distal dexterity (i.e., the ability of orienting the surgical end effector as desired).
The paradigm in Fig. 3A involves a few skin incisions, and it is referred to as a multi-port procedure. In this procedure, the surgical manipulators shall realize remote-center-of-motion (RCM) movements or multiple degree-of-freedom (DoF) intracorporeal motions. These manipulators with different forms and actuation schemes are discussed in the “RCM movements” and “Intracorporeal movements” sections, respectively.
The desire to further reduce surgical invasiveness led to the proposal of single-port procedures [
15] and natural orifice transluminal endoscopic surgery (NOTES) [
16]. In single-port procedures, surgical instruments have to be inserted into the abdomen through a single incision to access the surgical site. It is challenging for surgeons to become proficient in crossed hand–eye coordination; thus, robotic assistance is needed. The inserted instruments can be arranged in two ways: the X configuration as shown in Fig. 3B and the Y configuration as shown in Fig. 3C. In the X configuration, the technical approaches from the multi-port surgical system may be directly applied, but collisions between the surgical manipulators and the instruments should be carefully avoided as the instruments’ workspace is mutually limited. In the Y configuration, the surgical instruments shall be unfolded to form a working pose. Collisions are minimized, but the instruments’ payload capability is of concern because the external load has a much longer moment arm compared with the moment arm of the actuation force from the cable inside the instrument’s stem. The existing state-of-the-art systems for single-port procedures are summarized in the “Systems for single-port procedures” section.
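The payload concern in the Y configuration follows from a static torque balance about a joint. The sketch below is a minimal illustration, assuming a single revolute joint, a cable acting at a fixed moment arm, and no friction. The function name and the numerical values are hypothetical and are not taken from any of the cited systems.

```python
def required_cable_tension(tip_force_n, load_arm_mm, cable_arm_mm):
    """Cable tension needed to hold a tip load in static equilibrium.

    Torque balance about the joint: tension * cable_arm = tip_force * load_arm.
    """
    if cable_arm_mm <= 0:
        raise ValueError("cable moment arm must be positive")
    return tip_force_n * load_arm_mm / cable_arm_mm


if __name__ == "__main__":
    # Illustrative values only: a 2 N tip load acting 60 mm from the joint,
    # resisted by a cable routed 3 mm from the joint axis.
    print(required_cable_tension(2.0, 60.0, 3.0))
```

Under these assumed values, the cable must carry twenty times the tip load, which illustrates why a slender stem limits payload.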
NOTES procedures access a surgical site via confined and curved natural orifices and can potentially avoid the use of any skin incision. However, the difficulty in instrumentation limits the application of NOTES procedures. Representative endoscopic and NOTES systems are summarized in the “Systems for endoscopic and NOTES procedures” section.
Systems for multi-port procedures
Surgical instruments are individually inserted through the incisions in the abdominal or thoracic wall during a multi-port procedure. They can either have a straight stick-like form with a distal wrist or a multi-DoF structure.
In the former case, the instruments shall be manipulated by the patient-side manipulators for RCM movements, where the instrument is pivoted with respect to the skin incision in order to avoid tearing the abdominal wall. Then, the instrument is given four DoFs by the extracorporeal manipulator, including the pitch and yaw DoFs, as well as the translation and rotation along and about the instrument’s axis. The forms and structures of RCM manipulators are reviewed in the “RCM movements” section. A distal wrist is often integrated to enhance the distal dexterity (i.e., increase the number of DoFs). Various wrist designs are reported in the “Wrist design and actuation” section.
A number of designs on multi-DoF surgical instruments are reviewed in the “Intracorporeal movements” section. These instruments realize dexterous intracorporeal movements without using RCM manipulators.
RCM movements
In the multi-port procedure, a stick-like instrument requires RCM movements: it can be tilted in the pitch and yaw directions with respect to the skin incision. These RCM motions are realized by the patient-side surgical manipulators, either by using an RCM mechanism or via controlling a manipulator’s multiple joints in a coordinated manner (i.e., programmable RCM movements). Several RCM mechanisms are shown in Fig. 4.
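The geometry of an RCM movement can be sketched as follows. This is a minimal illustration, assuming that the instrument axis points along the negative z direction at zero pitch and yaw, that pitch and yaw are rotations about the x and y axes, and that the insertion depth is measured from the RCM point. The frame convention and the function name are assumptions made for this example only.

```python
import math


def rcm_tip_position(rcm_point, pitch_rad, yaw_rad, depth_mm):
    """Return (tip_position, axis_direction) for an instrument pivoting at rcm_point.

    The axis direction is obtained by rotating (0, 0, -1) by yaw about y,
    then by pitch about x. Rotation about the instrument axis (roll) does
    not move the tip and is therefore omitted.
    """
    direction = (
        -math.sin(yaw_rad),
        math.sin(pitch_rad) * math.cos(yaw_rad),
        -math.cos(pitch_rad) * math.cos(yaw_rad),
    )
    tip = tuple(p + depth_mm * d for p, d in zip(rcm_point, direction))
    return tip, direction


if __name__ == "__main__":
    tip, axis = rcm_tip_position((0.0, 0.0, 0.0), 0.0, math.radians(30), 100.0)
    print(tip, axis)
```

Because the tip is always placed along a line through `rcm_point`, the instrument shaft passes through the incision for every pitch, yaw, and insertion value.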
In the parallelogram-based RCM mechanism in Fig. 4A with the representative implementations in the da Vinci system [
17] and the REVO-I system [
18], the rotation from the actuator at position A is transmitted to position O via the two parallelograms. The second actuator at position B enables a rotation about the BO axis. To reduce the mass of the RCM mechanisms so as to ease the dynamic control of the instrument’s movements, timing belts [
19] and cables [
20,
21] are used to realize equivalent parallelograms for RCM movements as shown in Fig. 4B. However, cable elongation under load can degrade the manipulator accuracy. To increase the structural rigidity, a parallel structure is used as shown in Fig. 4C, where two prismatic actuators are used to provide the 2-DoF rotation about the RCM point [
22].
The serially connected spherical linkage as shown in Fig. 4D can also be used [
23,
24]. To further increase the stiffness, a parallel spherical linkage was proposed as shown in Fig. 4E [
25]. The design in Fig. 4E realizes 3-DoF motions, including a rotation about the instrument axis. This axial rotation is usually not realized by the RCM mechanisms, for example, the ones in Fig. 4A–4D and 4F.
RCM mechanisms based on goniometer arc tracks are also possible candidates [
26,
27]. However, the arc length is proportional to the angular motion range, and large motion ranges lead to bulky designs and inconvenience in patient-side deployment.
Programmable RCM movements are realized via synchronized control of a manipulator’s multiple joints. The manipulators can have either serial [
28–
31], parallel [
32–
34], or hybrid [
35,
36] structures.
The control for programmable RCM movements can be realized using a geometrical approach [
28] or instantaneous kinematics. The latter mainly includes the use of (1) null space projection [
31,
37,
38] or (2) extended Jacobian methods [
36,
39–
41]. Dynamic control methods that minimize the contact force between the instrument and the trocar can also be used [
42,
43]. Compared with RCM mechanisms, manipulators with programmable RCM movements provide convenience during the preoperative docking of the instruments. However, RCM mechanisms are considered safer and more reliable due to the mechanically constrained movements.
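Whichever control method is used, a programmable RCM controller has to keep the instrument shaft passing through the incision. The sketch below computes only the quantity to be driven toward zero, namely the perpendicular distance from the trocar point to the shaft axis. It does not implement null space projection or an extended Jacobian, and the function and argument names are hypothetical.

```python
import math


def rcm_deviation(shaft_base, shaft_tip, trocar):
    """Perpendicular distance from the trocar point to the line through the shaft."""
    axis = [t - b for t, b in zip(shaft_tip, shaft_base)]
    to_trocar = [p - b for p, b in zip(trocar, shaft_base)]
    # The cross product magnitude equals |axis| times the perpendicular distance.
    cross = (
        axis[1] * to_trocar[2] - axis[2] * to_trocar[1],
        axis[2] * to_trocar[0] - axis[0] * to_trocar[2],
        axis[0] * to_trocar[1] - axis[1] * to_trocar[0],
    )
    axis_norm = math.sqrt(sum(c * c for c in axis))
    if axis_norm == 0.0:
        raise ValueError("shaft_base and shaft_tip must be distinct points")
    return math.sqrt(sum(c * c for c in cross)) / axis_norm


if __name__ == "__main__":
    # Shaft along the z axis; a trocar offset of (3, 4, 0) mm lies 5 mm off the axis.
    print(rcm_deviation((0.0, 0.0, 100.0), (0.0, 0.0, -100.0), (3.0, 4.0, 0.0)))
```

A mechanically constrained RCM mechanism keeps this deviation at zero by construction, whereas a programmable RCM manipulator must regulate it in software.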
Besides the two aforementioned RCM approaches, a passive RCM technique can also be adopted [
44–
46], wherein the extracorporeal manipulator is underactuated and the incision point helps to determine the pose of the inserted instrument. The passive RCM technique is safer, but the instrument’s accuracy is often affected by the compliance from the incision port in the abdominal wall under the pneumoperitoneum.
Wrist design and actuation
While the RCM movements of an instrument are realized by an extracorporeal manipulator, a wrist is often integrated at the distal end for dexterity enhancement such that suturing and knot tying can be more conveniently conducted.
Many wrists adopt serial-structured designs. The desire for design compactness and proximal actuator arrangement often leads to the choice of cable actuation [
17,
24,
47,
48], including the famous EndoWrist design shown in Fig. 5A. Shape memory alloy actuation is also a possible approach to realize compact wrist designs [
49,
50]. However, the motion responses are relatively slow. For example, the thermal exchange took approximately 8 s to complete in Reference [
49].
To enhance the wrist’s structural rigidity, serial and parallel linkages are also proposed. Representative examples using serially connected coupler actuation [
51] and a 3-PRS structure [
52] are shown in Fig. 5B and Fig. 5C, respectively. However, these linkage-actuated wrists may have limited motion ranges. For example, motion ranges of the pitch and yaw joints are only ±40° in Reference [
51], whereas the parallel-linkage design in Reference [
52] has a pitch motion from −50° to 70° and a yaw motion of ±64°. In comparison, the EndoWrist in the da Vinci system has a pitch motion range of ±70° and a yaw motion range of ±90° [
17].
The pulleys used in the cable-driven wrists and the pinned joints in the linkage-driven wrists have limited potential for further miniaturization. Continuum mechanisms (a term coined in Reference [
53]), which transmit forces and motions via the structures’ continuous deformations, have therefore been explored. Examples of continuum wrists for multi-port procedures using a 2-DoF bending segment are shown in Figs. 5D [
27] and 5E [
54], featuring a multi-backbone design and a concentric-tube design, respectively.
The structural and modeling simplicity makes the design popular in many surgical robotic systems for single-port and NOTES applications [
55–
57]. To further improve the structural simplicity, a deformable wrist design was recently proposed as shown in Fig. 5F [
58]. The rigidity is also enhanced as elastic strips were used to replace elastic wires. In these continuum wrists, fatigue may seem to be a problem due to the repetitive bending. However, superelastic nitinol can easily undergo 10⁵ cycles of deformation [
59], which fulfills the requirement for a multi-use instrument.
Intracorporeal movements
Besides stick-like instruments with an RCM manipulator, the patient-side surgical manipulator can also be designed to directly realize multi-DoF intracorporeal movements. In this case, only a lockable stand is required to hold these dexterous surgical manipulators to the entry ports of a patient’s abdomen. Then, the manipulators will no longer undergo RCM movements (e.g., swinging back and forth), eliminating the risks of mutual collisions of these bedside manipulators.
Surgical manipulators with multiple intracorporeal DoFs can have either articulated, continuum, or hybrid structures, as shown in Fig. 6.
In articulated designs, cable actuation [
47] or embedded motors [
28,
48] can be used. However, the use of embedded motors often results in bulky designs, and cable actuation is affected by tension keeping. Thus, the continuum structure has recently become a popular choice, where cable tension keeping can be eased by the elastic structure [
54,
60], or avoided in multi-elastic-backbone designs [
61,
62]. A hybrid design using a 2-DoF inverted dual continuum mechanism for intracorporeal translation and a 2-DoF EndoWrist for orientation has recently been proposed; this design combines the advantages of the articulated wrist that provides enhanced dexterity in confined spaces and the dual continuum mechanism that provides enhanced payload and reliability [
63].
Through the use of intracorporeal manipulators, the possibility of collision among the extracorporeal RCM manipulators is minimized. However, an intracorporeal manipulator should be fully inserted to deploy all its joints for multi-DoF movements. Its dexterity is hence limited at regions close to the abdominal wall. For wider clinical applications, most robotic systems for multi-port procedures still adopt the approach with RCM manipulators.
Systems for single-port procedures
General laparoscopic and thoracoscopic single-port procedures were introduced for reducing surgical invasiveness at the cost of increased manual manipulation difficulties and instrumentation complexity. Robotic assistance was introduced, and the adopted actuation schemes include cable/tendon actuation, embedded motorization, linkage-based transmission, and continuum mechanisms.
Cable actuation was adopted in the X and Y configurations, as shown in Fig. 3. The examples adopting the X configuration include the da Vinci Single-Site video endoscopic single-port access (VeSPA) surgical platform [
64] and the Samsung single-incision surgical system [
65]. To avoid manipulator collisions and instrument interference, many systems adopted the Y configuration, including the da Vinci SP system [
56], the Single Port Orifice Robotic Technology (SPORT) system [
66], the SurgiBot system [
67], and the single-port system with a flexible access tube [
68]. Some examples are shown in Fig. 7A. Cable/tendon actuation is a relatively mature technique. Ideally, pulleys should be applied at the manipulator’s joints to improve reliability and transmission smoothness. However, the pulley size becomes a major limit for further miniaturization of a multi-joint instrument. For this reason, some designs do not use pulleys and let the actuation cables slide against the rounded edges of the structural components. Besides cable wear, the introduced friction can cause actuation hysteresis and affect movement accuracy. In all the designs, the actuation cables shall be arranged distant enough from the joint axis to generate enough forces at the instrument’s distal end. Thus, most existing systems have an access tube diameter larger than 25 mm.
The following systems that use embedded miniature motors adopted the Y configuration: Single-Port lapaRoscopy bImaNual robot (SPRINT) system (Fig. 7B) [
69], Robot-Assisted Surgical Device (RASD) system (Virtual Incision Corporation, Lincoln, Nebraska, USA), the Single-Incision in vivo Surgical Robot (SISR) system [
70], and the NISI single-port robotic system [
71]. The use of embedded motors inside the manipulator provides convenient modular joint designs. However, a sufficient motor power rating leads to a relatively large manipulator diameter. Most existing surgical systems adopting this approach need an access port larger than 30 mm in diameter. The sterilizability of the embedded motors may also increase the system costs.
Linkages can also be used for manipulator design in a single-port system. Representative examples include the single-port surgery (SPS) system (Fig. 7C) [
72] and the PLAte-spring mechanism-based LAparoscopic Surgical robot (PLAS) system [
73]. The linkage-based manipulators generate relatively good payload performance. However, the inherent difficulties in designing multi-DoF spatial linkages, such as transmission and interference avoidance, often lead to limited instrument distal dexterity. For example, the maximal joint angle is only 45° in the SPS system [
72]. In addition, miniaturization of the manipulator may be limited by the size of the structural hinges. Most linkage-based systems require an incision diameter larger than 25 mm [
72,
73].
Continuum mechanisms transmit force and movement via structural deformation. Hence, all structural members undertake dual roles of structure and transmission. Using a continuum mechanism to design a single-port surgical robot can therefore achieve better design compactness. Examples using this design include the Insertable Robotic Effector Platform (IREP) system with a 15 mm stem diameter [
74] and the SJTU Unfoldable Robotic System (SURS) with a 12 mm stem diameter (Fig. 7D) [
57]. In addition, the dual continuum mechanism proposed in Reference [
57] introduces actuation modularity and substantially increases the payload capability of the continuum surgical manipulator, making it an appealing candidate for future surgical robotic products. In the clinical scenario where only low payload capability is needed (e.g., performing ablation), a cardiothoracic endoscopic soft surgical robot was proposed [
75].
Systems for endoscopic and NOTES procedures
The delivery of surgical tools through narrow and curved natural orifices creates strict constraints on the tool’s distal dexterity and payload capability [
76]. The application of endoscopic and NOTES procedures is hence limited due to the challenges in tool instrumentation [
77]. Existing systems usually adopt the designs with articulated or continuum structures.
In articulated designs, structural compactness is of primary consideration. Besides cable actuation [
78,
79], surgical manipulators can also use embedded miniature motors [
80–
83]. The use of embedded miniature motors leads to beneficial design modularity and even system reconfigurability. These robots can move away from the cavity entrance and are typically magnetically anchored on the abdominal wall from an outside dock. However, further miniaturization of such robots is challenging due to various integrated actuation components.
Continuum robots, on the other hand, incorporate elastic structures and tendon/backbone actuation schemes, leading to a compact and flexible design [
55,
84–
90]. Two representative examples are shown in Fig. 8C. However, due to the inherent compliance of continuum structures, the payload capability often deteriorates as the total length of the arm increases.
Additional sensors in robotic surgery
To further enhance the safety and functionality of a surgical robotic system, additional sensors were integrated. Two major categories are primarily involved: force sensing and supplementary visual modalities.
Force sensing
Force sensing in surgical robotic systems can generate faithful force feedback to the operating surgeon and can potentially help increase the safety of typical surgical tasks, including tissue manipulation [
91] and blunt dissection [
92].
Force sensing during surgery mainly includes two types: (1) measuring the tissue gripping force and (2) measuring the interaction force with the environment. Sensing approaches include extrinsic force sensing and intrinsic force sensing. Extrinsic force sensing integrates a sensing element at the location of contact and directly measures the force, whereas intrinsic force sensing, proposed in Reference [
93], uses actuator-level information to calculate the contact force, as shown in Fig. 9C. The extrinsic approach may provide a more accurate result. However, the increased complexity of sensor-integrated instruments may reduce reliability, complicate sterilization, and raise costs.
Gripping force sensing using the extrinsic approach includes the use of a strain gauge [
94], polyvinylidene difluoride (PVDF) elements [
95], fiber Bragg grating (FBG) [
96], and capacitive sensing cells (Fig. 9A) [
97]. Given that a gripping force is always actively applied, the intrinsic sensing approach might be more suitable. Representative examples include measuring the driving shaft tensions [
51], driving pulley strains [
98], driving pulley torques [
99], and motor currents [
100].
Interaction force sensing with environment adopting the extrinsic approach includes the use of strain gauges [
101,
102] with one design (Fig. 9B) [
101], optical intensity [
103,
104], FBG [
105], and PVDF piezoelectric film [
106]. On the other hand, intrinsic sensing is implemented by measuring the backbone actuation forces [
107], motor currents [
108], and actuation pulley strains [
98]. Compared with the extrinsic sensing approach, the intrinsic sensing approach allows versatile extracorporeal sensor arrangement, which enables structural simplification of the inserted instruments. However, intrinsic force sensing through actuation cables or motor currents is affected by friction and moving inertia. Thus, the accuracy may be limited.
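The sensitivity of intrinsic sensing to friction can be seen in a simple static estimate from motor current. The following sketch assumes a linear torque constant, a fixed gear ratio, a constant friction torque modeled as a dead band, and a known moment arm between the joint and the contact point. All names and numbers are hypothetical and do not describe any cited implementation.

```python
import math


def estimate_tip_force(current_a, torque_constant_nm_per_a, gear_ratio,
                       friction_torque_nm, moment_arm_m):
    """Estimate the contact force at the tip from a measured motor current.

    Torque below the assumed friction level cannot be distinguished from
    friction, so the estimate is zero inside that dead band.
    """
    if moment_arm_m <= 0:
        raise ValueError("moment arm must be positive")
    joint_torque = current_a * torque_constant_nm_per_a * gear_ratio
    net_torque = max(abs(joint_torque) - friction_torque_nm, 0.0)
    return math.copysign(net_torque, joint_torque) / moment_arm_m


if __name__ == "__main__":
    # Illustrative values only.
    print(estimate_tip_force(0.5, 0.02, 50.0, 0.1, 0.2))
    print(estimate_tip_force(0.05, 0.02, 50.0, 0.1, 0.2))
```

In the second call the motor torque stays below the assumed friction level, so a small contact force would go undetected, which is one way friction limits the accuracy of this approach.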
Supplementary visual modality
Endoscopic imaging is the basic visual modality in keyhole surgery, and it provides a direct view of the targeted surgical site. Besides endoscopic imaging, other visual modalities are often integrated to provide richer intracorporeal information at the anatomical, tissue, or cellular levels to improve visualization and facilitate surgical treatments.
Surface reconstruction
Surface reconstruction provides spatial information about the tissue surface geometry, which can be used for depth perception and surgical navigation [
110,
111]. Several approaches can be used for 3D surface reconstruction of a surgical scene intraoperatively, including the use of stereoscopes [
112–
114] or structured light [
115–
117]. However, miniaturization can be challenging because a minimum baseline distance is required [
118].
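The role of the baseline can be illustrated with the standard pinhole triangulation relation, in which depth equals focal length times baseline divided by disparity. The sketch below assumes an ideal rectified stereo pair and a fixed disparity error. The function names and the numerical values are illustrative only.

```python
def stereo_depth(focal_px, baseline_mm, disparity_px):
    """Depth from disparity for an ideal rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_mm / disparity_px


def depth_uncertainty(focal_px, baseline_mm, depth_mm, disparity_err_px=1.0):
    """First-order depth error: dZ = Z**2 / (f * B) * dd."""
    return depth_mm ** 2 * disparity_err_px / (focal_px * baseline_mm)


if __name__ == "__main__":
    # Illustrative values: halving the baseline doubles the depth error at 50 mm.
    print(stereo_depth(1000.0, 5.0, 100.0))
    print(depth_uncertainty(1000.0, 5.0, 50.0))
    print(depth_uncertainty(1000.0, 2.5, 50.0))
```

Since the depth error grows in inverse proportion to the baseline, shrinking the scope diameter directly reduces the achievable depth resolution.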
The 3D shape of the environment can also be obtained from the motion of a single endoscope, including the Shape-from-Motion (SfM) [
119] and Shape-from-Shading (SfS) techniques [
120]. In SfM, a 3D model template is first constructed from multiple views of the tissue, and tissue deformation is detected from the motions of the point cloud. In SfS, the depth of each pixel is computed by relating the pixel’s brightness to its normal surface direction with a reflectance model. These techniques can be readily applied with current laparoscopic hardware, but the performances rely on the feature correspondence or the validity of the reflectance model [
118].
Simultaneous localization and mapping (SLAM), which updates an environment map via a camera’s view while tracking the camera’s location, has also been utilized in surgical surface reconstruction [
121,
122]. SLAM is appealing in the surgical scene due to its real-time capability even while using a standard laparoscope. Recent works that addressed the challenges such as tracking deformable tissues and robust feature matching [
123,
124] have made SLAM even more promising (Fig. 10A).
Fluorescence and spectral imaging
Near-infrared fluorescence (NIRF) imaging enhances the contrast of specific surgical sites against surrounding tissues. Near-infrared light (700–900 nm) can travel up to centimeters through tissues to reach the fluorescent contrast agents, which have been developed for different types of targets according to the absorption and scattering properties of the tissue components [
128]. The applications of NIRF include sentinel lymph-node mapping, identification of vascular and biliary anatomy, and assessment of organ and tissue perfusion [
125] (Fig. 10B1). Fluorescence is integrated into the da Vinci surgical system and was found to be highly useful [
129]. Specific surgical procedures employing NIRF include partial nephrectomy [
130], cholecystectomy [
131], thymectomy [
132], lymphadenectomy [
133], and intestinal anastomosis [
134]. Another application is to use NIRF as a positioning marker. For example, supervised autonomous suturing was performed on soft tissues under the guidance of NIRF markers [
135].
Spectral imaging acquires multiple images of the tissue at different wavelengths to reveal tissue characteristics. Multispectral imaging and hyperspectral imaging differ from each other in spectral resolution, band quantity, band width, and band contiguousness [
136]. Multispectral images are taken in a time span of several hundred milliseconds, during which the tissue and the camera shall remain steady with respect to each other, making it challenging to apply in a surgical setting [
137]. Deblurring algorithms were proposed to improve multispectral images affected by tissue/camera movements [
138]. Hyperspectral imaging was used to characterize oxygenation during robotic partial nephrectomy [
139]. Given that spectral imaging better reveals the tissue-specific optical characteristics compared with RGB imaging, it is also employed in automated tissue classification [
126,
140], as shown in Fig. 10B2.
Confocal endomicroscopy
Confocal endomicroscopy, which is an imaging technique for increasing resolution via the use of a spatial pinhole to reduce out-of-focus light, enables in vivo histopathology by providing cellular level information and serves as an “optical biopsy.” The advantages of confocal endomicroscopy are evident: it provides high-resolution, real-time, dynamic images of tissues in a noninvasive manner and can be used to examine large areas during surgery. Probe-based confocal laser endomicroscopy (pCLE) [
141] (Fig. 10C) is also promising due to its high-speed imaging, mosaicing algorithms, and robotic instrument/probe control [
142]. In particular, robot-assisted mosaicing (i.e., stitching adjacent image frames) has been widely investigated for large area tissue scanning using pCLE [
127,
143], whereas force feedback [
144] and visual servoing [
145] are used to ensure stable and robust tissue contact. Projecting the pCLE images back onto the endoscopic footage was made possible using tactile-based 3D surface reconstruction [
146] and SLAM [
147], allowing potential lesions to be conveniently targeted for subsequent treatments.
Control approaches
Most existing robotic systems for keyhole and endoscopic procedures utilize a master–slave teleoperation paradigm [
148], where the patient-side manipulators are on the slave side and the surgeon console is on the master side. The movements of the patient-side slave manipulators shall follow the trajectory commands from the master (usually haptic) devices. The advances in teleoperation are reported in the “Teleoperation” section. However, teleoperation has zero level of autonomy in a robot-assisted surgery [
149,
150]. Exploiting higher levels of autonomy can be of greater assistance to surgeons, and the state-of-the-art progress is summarized in the “System autonomy and surgical automation” section.
Teleoperation
Although the early studies on teleoperation for laparoscopy were conducted over long physical distances [
151], in practice, the master and slave sides are usually in the same room. Currently, teleoperation can enhance surgeons’ capabilities through hand tremor elimination, motion scaling, and continued inputs for fully exploiting superhuman tool dexterity [
3]. A review on master haptic devices is presented in the “Haptic devices” section, whereas the controls involved in teleoperation are discussed in the “Bilateral control” and “Shared control” sections.
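Motion scaling and tremor attenuation can be sketched as a simple mapping from master hand increments to slave tool increments. The example below assumes a first-order exponential low-pass filter as the tremor attenuator and a constant scale factor. Practical systems may use different filters and mappings, and the class name, parameters, and default values are hypothetical.

```python
class MasterSlaveMapper:
    """Maps master hand increments to filtered, scaled slave tool increments."""

    def __init__(self, scale=0.2, alpha=0.1):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must lie in (0, 1]")
        self.scale = scale    # slave motion per unit of master motion
        self.alpha = alpha    # filter gain; smaller values smooth more
        self._filtered = [0.0, 0.0, 0.0]

    def update(self, master_delta):
        """Process one control cycle and return the slave increment."""
        # Exponential low-pass filter attenuates high-frequency hand tremor.
        self._filtered = [
            self.alpha * m + (1.0 - self.alpha) * f
            for m, f in zip(master_delta, self._filtered)
        ]
        # Scale the smoothed increment down for fine tool motion.
        return [self.scale * f for f in self._filtered]


if __name__ == "__main__":
    mapper = MasterSlaveMapper(scale=0.5, alpha=0.5)
    print(mapper.update([2.0, 0.0, 0.0]))
```

A smaller `alpha` suppresses more tremor at the cost of added lag, so the filter gain trades smoothness against responsiveness.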
Haptic devices
Haptic feedback includes cutaneous (tactile) and kinesthetic (force) information [
152], which can potentially improve the performance of delicate surgical tasks that involve interactions between the surgical manipulator and the environment (e.g., suturing or dissection) [
91,
153]. Haptic feedback to operating surgeons is supported by haptic devices, which generate kinesthetic interactions for the surgeons to perceive the remote environment. Haptic devices sense the positions and/or orientations and generate forces and/or torques.
According to the capabilities of the sensing inputs and the haptic (force/torque) outputs, haptic devices can be divided into three categories as follows:
(1) 3-DoF inputs (usually positions) and 3-DoF outputs (usually forces): Commercial products include the delta.3 and omega.3 devices (Force Dimension) and the Novint Falcon (Novint Technologies). Research prototypes include the SHaDe device [
154] (orientation inputs and torque outputs), joystick mechanism [
155], and DELTA-R device [
156] (Fig. 11A).
(2) 6-DoF inputs (positions and orientations) and 3-DoF outputs (usually forces): Commercial products include the Touch and TouchX devices (3D Systems) and the omega.6 device (Force Dimension). The laparoscopic interface [
157] is one typical research prototype (Fig. 11B).
(3) 6-DoF inputs and multi-DoF outputs (five or six force/torque components): Commercial products include the delta.7 and sigma.7 devices (Force Dimension) and the Phantom Premium (3D Systems). Research prototypes include the haptic pen [
158], VISHARD6 [
159], PATHOS-II [
160], haptic cobot [
161], modified DELTA-R device [
156], pinch–grasp haptic interface [
162], VirtuaPower device [
163] (Fig. 11C), and CombX device [
164].
Bilateral control
Two types of control architectures are commonly adopted in surgical teleoperation: unilateral and bilateral control. In unilateral control, the slave motions are directly specified by the master. Unilateral control is an effective and straightforward approach that has been widely implemented in robotic surgery systems (e.g., in the da Vinci system).
The main drawback of unilateral control is the absence of force feedback. Bilateral control, on the other hand, refers to the control of master–slave systems with force/position information exchange. The design considerations for the bilateral controller are to maintain the stability and transparency of the closed-loop system [
165]. Studies have shown that perfect transparency is not possible in practice, as it requires exact knowledge of the dynamics of a master–slave system that transmits force and position in both directions without communication delay [
166]. The ideally transparent system is marginally stable, and increased stability robustness is achieved by reducing the bandwidth of accurate transparency, causing a trade-off between stability and transparency in the design of bilateral controllers [
167]. Due to the complexity of sensor integration and control analysis, control architectures with a reduced number of channels for sensory information exchange are commonly adopted. In telesurgery, the slave is usually under position control, which leads to an impedance-type or a direct-force-feedback-type implementation of the bilateral controller.
Impedance control (also referred to as position-error-based control [
168]) is a sensor-less control architecture that reflects to the operator a force derived from the difference between the desired and actual positions of the slave robot. This force is an indication of the interaction between the slave robot and the environment only when the friction and inertia are low [
152]. Therefore, this method is difficult to apply to detecting interactions with soft tissue, and friction compensation is needed to improve the transparency of such bilateral systems [
169,
170].
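The position-error-based scheme described above can be sketched in a few lines. This is a single-axis illustration under stated assumptions: the function name, the virtual stiffness of 200 N/m, and the 5 N saturation limit are hypothetical and are not drawn from the cited works.

```python
def reflected_force(x_desired, x_actual, stiffness=200.0, force_limit=5.0):
    """Force (N) reflected to the operator along one axis.

    x_desired: slave position commanded by the master (m)
    x_actual:  measured slave position (m)
    stiffness and force_limit are assumed, illustrative values.
    """
    # The force opposes the master motion that the slave could not follow.
    force = stiffness * (x_actual - x_desired)
    # Saturate the output to stay within an assumed haptic device limit.
    return max(-force_limit, min(force_limit, force))
```

Note that any tracking error produces a reflected force, including errors caused by friction or inertia in free motion. This is why the reflected force indicates environment interaction only when friction and inertia are low.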
Direct force feedback, on the other hand, requires the measurement of the interaction forces between the slave and the environment [
171–
173]. It provides better performance in position and force tracking than position-error-based control [
174,
175]. However, practical challenges are encountered in force sensor integration due to miniaturization, sterilization, and biocompatibility issues.
Shared control
In shared control, the slave is teleoperated by the operator with active assistance from the robot. A typical application of shared control in telesurgery is the use of virtual fixtures [
176], which generate force and position signals as assistance to improve safety and accuracy, while the surgeon remains in control. Two types of virtual fixtures are commonly adopted: guidance virtual fixtures and forbidden-region virtual fixtures [
177]. For example, virtual fixtures have been used to provide guidance to target anatomy [
178] and motion constraints on robot-assisted suturing [
179]. Virtual fixtures have also been used for surgical training [
180] and co-manipulation [
181]. User studies show that virtual fixtures improve surgical performance during procedures such as suturing, needle passing, and knot tying [
182,
183].
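The two fixture types can be illustrated geometrically. The sketch below assumes a spherical forbidden region and a straight-line guidance path; both shapes, the function names, and the choice to act on commanded positions (rather than on forces) are simplifying assumptions made for illustration only.

```python
import math


def forbidden_region_clamp(p, center, radius):
    """Keep a commanded point p outside a spherical forbidden region."""
    d = [pi - ci for pi, ci in zip(p, center)]
    dist = math.sqrt(sum(x * x for x in d))
    if dist >= radius:
        return list(p)  # already outside: command passes through unchanged
    if dist == 0.0:
        # Direction is undefined at the center; push out along +z (arbitrary).
        return [center[0], center[1], center[2] + radius]
    k = radius / dist
    # Project the command radially onto the sphere surface.
    return [ci + k * di for ci, di in zip(center, d)]


def guidance_projection(p, a, b):
    """Project a commanded point p onto the guidance segment from a to b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(x * x for x in ab)
    if denom == 0.0:
        return list(a)  # degenerate segment
    t = sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))  # stay within the segment end points
    return [ai + t * x for ai, x in zip(a, ab)]
```

In a shared-control loop, the surgeon's command would be passed through one of these functions before being sent to the slave, so the surgeon remains in control along the permitted directions while the fixture constrains the rest.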
System autonomy and surgical automation
The current development of artificial intelligence is far from capable of supporting autonomous surgery. Partial surgical autonomy can benefit surgical treatments by automating repetitive tasks and letting surgeons concentrate on critical operations [
150].
The control architectures of teleoperated surgical systems can be classified as direct/bilateral control, shared control, and supervisory control depending on the degree of user interaction. Yang
et al. proposed a framework of autonomy levels, in which medical robots have no autonomy, assistance autonomy, task autonomy, conditional autonomy, high autonomy, and full autonomy, corresponding to autonomy level 0 to 5, respectively [
149]. A similar six-level scaling is proposed in Reference [
184], which was developed from the discussions and a technical report of an ISO/IEC Joint Working Group.
Although supervised autonomy has been successfully applied in some robot-assisted surgeries such as orthopedic surgery and neurosurgery, the level of autonomy for keyhole surgery remains low (mostly teleoperation). With the growing number of surgical procedures and the massive quantity of available surgical data, standardized autonomy rating and classification are desired to guide the development of future surgical robots [
149,
185].
Surgical robotic systems with a high level of autonomy, as delineated by the above standards, are able to make (some) decisions under the supervision of a surgeon. Automation attempts are hindered by challenges in acquiring information and executing tasks in surgical environments with soft tissues or moving organs. Emerging studies have presented technical progress in these aspects that potentially enables the development of next-generation cognitive surgical systems [
186]. In the rest of this section, the state-of-the-art surgical autonomy for keyhole surgeries will be reported.
Information acquisition
Visual modality provides a direct approach to the intracorporeal scene, which generally includes organs, tissues, and surgical instruments. Information acquisition refers to the extraction of informative segments from the scene and their interpretation as semantic or kinematic data, based on which the robotic surgical system executes surgical tasks autonomously.
Detecting the presence, determining the positions, and tracking the trajectories of surgical instruments are important not only in extending surgeons’ capabilities in teleoperation, but also in better facilitating autonomous surgery [
187]. The presence detection approaches adopted include color distinction [
188], geometric feature matching [
189], color and texture features [
190], and radiofrequency identification [
191]. Positioning and tracking methods can be marker-based, for example, combining laser pointers and optical markers [
192] and using specially designed black/white patterns [
193]. Markerless approaches are also possible, for example, relying on the end effector geometry and online kinematic data for pose estimation [
194,
195]. Learning-based methods include a probabilistic condensation algorithm relying on a priori geometric knowledge of the instruments [
196], kinematics-combined randomized trees [
197,
198], shape-dependent feature descriptor for pose estimation [
199], and deep-learning-based instrument segmentation [
200,
201].
Surgeries involving soft tissues have not been performed autonomously because of the limited ability of vision systems to track and distinguish tissues in dynamic surgical environments [
202]. Given that fluorescence imaging or multispectral imaging increases system complexity and cost, autonomous segmentation or tracking of organs using general endoscopic imaging is still actively investigated. In methods that utilize preoperative priors [
203,
204], the organs are usually segmented in the preoperative CT model, and the registration of the model to endoscopic images is performed to account for organs’ motion and deformation. Other works used image processing approaches or machine learning strategies without prior information. The image processing approaches include basic thresholding and merging [
205], homogeneity and hue [
206], gradient-based methods [
207], and optical flow [
208]. Segmentation based on machine learning has been achieved using random forest [
209], support vector machines [
210], and fully convolutional neural networks [
211]. Learning-based segmentation is promising due to the increasing amount of available laparoscopic data.
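To make the simplest image processing approach concrete, the sketch below builds a binary mask by thresholding hue and saturation per pixel. It is a toy illustration in pure Python: the function name, the image representation (rows of 8-bit RGB tuples), and the threshold values are assumptions and do not reproduce any of the cited methods.

```python
import colorsys


def hue_mask(image, hue_min, hue_max, min_saturation=0.2):
    """Return a boolean mask for pixels whose hue lies in [hue_min, hue_max].

    image: list of rows, each row a list of (r, g, b) tuples in 0..255.
    Hue is expressed in 0..1 as returned by colorsys.rgb_to_hsv.
    Thresholds are illustrative and would need tuning on real footage.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # Reject low-saturation (grayish) pixels, whose hue is unreliable.
            mask_row.append(s >= min_saturation and hue_min <= h <= hue_max)
        mask.append(mask_row)
    return mask
```

A practical pipeline would add the merging step mentioned above (for example, grouping connected pixels into regions) and would need to handle reddish hues that wrap around the end of the hue scale.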
Another topic is the analysis of instrument motions recorded from surgical procedures, which is also known as surgical task segmentation. It is extensively used for surgical skill assessment [
212,
213] or constructing finite state machines for automation [
214,
215]. Early studies have focused on supervised task segmentation [
216–
218], where a set of predefined surgical motions with explicit semantic sense is required. Given that laparoscopic surgical gestures do not follow predefined patterns and involve temporal variations, the supervised approaches lack the acuity in detecting surgeon’s gestures. In addition, manual annotation of the training data set is impractical when the data become massive. Hence, recent studies often treat the surgical task segmentation in an unsupervised fashion [
219–
221]. Improved robustness to looping (i.e., failures and repetitions in surgical procedures) and noise was reported based on evaluations on specific surgical tasks [
221] using unsupervised segmentation.
Autonomous planning and execution of surgical subtasks
Suturing and knot tying are two fundamental tasks in minimally invasive surgery (MIS); they are time-consuming and carry a high risk of injuring the surrounding unstructured in vivo environment. Automating these two tasks is challenging considering the thread’s flexibility, position, and tension; tissue deformations; and the constrained workspace.
The first robotic suturing in MIS was investigated in Reference [
222]. The autonomous robotic suturing algorithms proposed for the EndoBot were based on observations of manual suturing operations, which were divided into stitching, creating a suture loop, developing a knot, and securing a knot. To minimize the task uncertainty from the tissue deformations and the pose of the suture needle, other path planning approaches were subsequently proposed, including relying on kinematic analysis and geometric modeling of the stitching task [
223], creating an analytical solution from manual suturing [
224], and sequential convex programming [
225].
To accommodate the uncertainty and adapt to online changes, a few alternative approaches were proposed by using (1) human guidance, where a laser pointer was maneuvered by a surgeon to pinpoint the entry for an automatic stitching under visual servoing [
226]; (2) fluorescent imaging, where near-infrared fluorescent imaging was used to detect and track soft tissue deformations and automatically compute stitch arrangement [
202,
227]; and (3) tissue deformation modeling, where a penetration-induced deformation matrix was introduced to adaptively estimate suturing trajectories [
228].
On the other hand, learning-based approaches were proposed for motion planning of surgical subtasks [
150]. Examples include the RNN-based autonomous knot tying using the EndoPAR robot [
229], learning by demonstrations and decomposing demonstrations into meaningful primitives [
230], apprenticeship learning under a variant of iterative learning control [
231], and non-rigid registration mapping between the demonstration scene and the test scene [
232].
Besides suturing and knot tying, other autonomous surgical subtasks were also attempted, including multilateral debridement [
233], multilateral cutting of 3D viscoelastic and 2D orthotropic tissue phantoms [
214], electro-surgery [
234], palpation [
235], and blunt dissection [
236]. All these subtasks can be potentially fitted within the framework for hierarchical subtask execution planning, which was proposed in Reference [
237]. Such an integrated effort is expected to eventually bring all these pilot autonomous functions into one working system.
Challenges and future perspectives
In multi-port surgical procedures, robotic assistance, which has brought high-definition stereo imaging, enhanced dexterity, and intuitive and fine control of instruments, has gradually gained acceptance across the world with its applications in various surgical departments, including general surgery, urology, cardiothoracic surgery, and gynecology. Suitable technologies for imaging, system design, and robotic manipulation in multi-port procedures have also progressively arrived at consensus.
On the other hand, a movable vision module with illumination and two to three surgical manipulators have to be deployed to a surgical site via a straight or curved access channel in single-port and NOTES/endoscopic procedures, as shown in Fig. 3C. The clinical need for robotic assistance in these procedures is even more apparent than that for multi-port procedures. However, even though a few systems have received clinical clearance (e.g., the da Vinci SP system), the effectiveness and durability of these robotic systems have yet to be tested. One fundamental reason is that the proximal joints in these intracorporeal manipulators are under extreme size constraints with a very high actuation requirement because an external load on the end effector has a much longer moment arm than that of the actuation force from the actuation element inside the manipulator. Furthermore, enough space and perhaps a central lumen for actuation of the distal joints and electrosurgical/mechanical end effectors (e.g., needle drivers and bipolar graspers) should be reserved in the proximal joints. It is hence extremely challenging to design and realize such an intracorporeal manipulator. Many technical approaches are still being actively explored as discussed in the “Systems for single-port procedures” and “Systems for endoscopic and NOTES procedures” sections.
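The moment-arm argument can be made concrete with a static torque balance about a proximal joint: the actuation force times its moment arm must equal the tip load times the load's moment arm. The sketch below is a simplified, single-joint estimate that ignores friction and the manipulator's own weight; the function name and the example numbers are assumptions chosen only to show the order of magnitude.

```python
def required_actuation_force(tip_load_n, load_arm_m, actuation_arm_m):
    """Static actuation force (N) needed to hold a tip load about one joint.

    Torque balance: F_act * actuation_arm = F_tip * load_arm.
    Friction and self-weight are ignored in this simplified estimate.
    """
    if actuation_arm_m <= 0.0:
        raise ValueError("actuation moment arm must be positive")
    return tip_load_n * load_arm_m / actuation_arm_m


# Assumed example: a 2 N tip load acting 50 mm from the joint, with an
# actuation element (e.g., a cable) routed 3 mm from the joint axis.
force = required_actuation_force(2.0, 0.050, 0.003)
```

With these assumed numbers, the actuation element must carry roughly 17 times the tip load (about 33 N), which illustrates why small-diameter proximal joints face very high actuation requirements.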
A recent opinion is to design specific systems for otorhinolaryngologic, transurethral, or gastrointestinal procedures. Given the functional and manipulation similarity (i.e., one vision unit and two tele-operated manipulators), one general system may be more welcome. However, the core question is whether such a universal, modular, and scalable solution (including materials, structures, and actuations) can be developed to fulfill all these needs.
Endoscopic imaging heavily depends on the technological readiness of chip fabrication techniques in industrial electronics. On the other hand, multispectral imaging, which needs only limited modification to existing endoscopic imaging systems with added controls over exposure and illumination, would likely continue to increase its clinical uses, providing high-definition images on multiple scales to facilitate surgical operations.
Besides expensive and complex surgical robotic systems, intelligent hand-held surgical instruments (e.g., the ones in References [
238,
239]) can also trigger future studies. Light weight and control intuitiveness are of paramount importance for these instruments to gain clinical acceptance.
While exploring new designs to address the aforementioned needs in single-port and NOTES/endoscopic procedures, enough attention should be directed toward physical forms and control realizations of the new designs.
Physical forms mainly involve the utilized materials and adopted structure topologies. Reducing instrument invasiveness while keeping payload and precision as high as possible calls for a design that is compact and strong, which in turn requires materials with high Young’s modulus. Soft materials with low Young’s modulus may only fit in implantable devices that interact with organs and tissues on a macroscale (e.g., the soft heart sleeve [
240]).
Articulated manipulators in single-port and endoscopic procedures may suffer from actuation deficiency. In cable-driven designs, maintaining proper tension for adequate actuation can be challenging, whereas the use of embedded motors and rigid linkages leads to design bulkiness and limited dexterity. Continuum mechanisms, on the other hand, deform the structure to transmit motions and forces. The dual roles of structural and transmission components can bring design compactness and expand the applications of continuum medical robots [
11]. Superelastic materials that allow large deformation shall be used. However, bending in a continuum structure has a finite radius, which may not be able to deliver the required dexterity in tightly confined spaces. Using a continuum-articulated hybrid structure, which utilizes the advantages of both structural topologies, may be a promising direction.
Regarding the control realization of future developments of surgical robotic systems, teleoperation will remain the main approach for the near future. Given the advances in communication technology (e.g., 5G), increased control information exchanges will be implemented, leading to increased teleoperation transparency.
Current artificial intelligence frameworks might only facilitate assistive surgical tasks, such as supervised suturing and knot tying. This may be due to the fact that the current data-driven approaches essentially generate an output by encoding thousands of particular samples. A new architecture for reasoning is eventually needed to safely and properly handle patient-specific surgical operations.
Conclusions
In keyhole and endoscopic surgical procedures, robotics has greatly improved ergonomics, manipulation dexterity, and intuitiveness. The gradual global acceptance indicates improved clinical outcomes. Given the history of how robotics has revolutionized various industrial sectors, the consensus is quite evident: the presence of robotics in medicine will inevitably increase. Future robotic assistance in surgery is expected to become a surgeon’s extended hands, eyes, and even mind in delivering delicate patient-specific treatments such that patients can benefit from the implementations of surgical robotic systems.